Spaced seed design on profile HMMs for precise HTS read-mapping - efficient sliding window product on the matrix semi-group

Author

  • Laurent Noé
Abstract

We propose a new method and its associated algorithm to efficiently compute seed sensitivity when High Throughput Sequencing reads are mapped along sub-parts of a known HMM alignment profile. This computation particularly makes sense with positioned spaced seeds. It relies on automata theory (previous work [KNR06]) combined with a matrix product problem. Interestingly, it brings to light an interval product problem considered more than twenty years ago in [AS87], but here with a sliding window aspect: we propose an efficient algorithm to compute this sliding window set of products using a linear number of unit products on the (associative, but non-commutative and non-invertible) matrix semi-group. This computational scheme is implemented in the ongoing 1.06 version of Iedera, which is available at http://bioinfo.lifl.fr/yass/iedera.php
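To make the sliding window product concrete, here is a minimal sketch of one well-known way to obtain every window product with a linear number of unit products when the operation is only associative (no inverses, no commutativity). It uses the classic block prefix/suffix decomposition, which is an assumption chosen for illustration and not necessarily the scheme described in the paper or implemented in Iedera. The names `sliding_window_products` and `matmul` are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sliding_window_products(
    items: Sequence[T], w: int, op: Callable[[T, T], T]
) -> List[T]:
    """Return items[i] op ... op items[i+w-1] for every window start i.

    Only associativity of `op` is assumed; operand order is preserved.
    Uses fewer than 3 * len(items) calls to `op`.
    """
    n = len(items)
    if w < 1 or w > n:
        raise ValueError("window length must satisfy 1 <= w <= len(items)")

    # Cut the sequence into consecutive blocks of length w.
    # prefix[j]: product from the start of j's block up to j (inclusive).
    prefix = list(items)
    for j in range(1, n):
        if j % w != 0:
            prefix[j] = op(prefix[j - 1], items[j])

    # suffix[i]: product from i up to the end of i's block (inclusive).
    suffix = list(items)
    for i in range(n - 2, -1, -1):
        if (i + 1) % w != 0:
            suffix[i] = op(items[i], suffix[i + 1])

    out = []
    for i in range(n - w + 1):
        if i % w == 0:
            # The window coincides with exactly one block.
            out.append(prefix[i + w - 1])
        else:
            # The window is a block suffix followed by the next block's prefix.
            out.append(op(suffix[i], prefix[i + w - 1]))
    return out


def matmul(a, b):
    """Plain matrix product on tuples of tuples (illustrative helper)."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0])))
        for i in range(len(a))
    )


# String concatenation is associative but neither commutative nor invertible,
# so it is a convenient stand-in for the matrix semi-group.
windows = sliding_window_products(list("abcdefg"), 3, lambda x, y: x + y)
```

With the string example, `windows` holds the five length-3 substrings in order. Passing `matmul` as `op` applies the same routine to a sequence of square matrices, where each window costs at most one extra product beyond the shared prefix and suffix tables.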


Similar resources

Designing Efficient Spaced Seeds for SOLiD Read Mapping

The advent of high-throughput sequencing technologies constituted a major advance in genomic studies, offering new prospects in a wide range of applications. We propose a rigorous and flexible algorithmic solution to mapping SOLiD color-space reads to a reference genome. The solution relies on an advanced method of seed design that uses a faithful probabilistic model of read matches and, on the ...


Seed-Set Construction by Equi-entropy Partitioning for Efficient and Sensitive Short-Read Mapping

Spaced seeds have been shown to be superior to continuous seeds for efficient and sensitive homology search based on the seed-and-extend paradigm. Much the same is true in genome mapping of high-throughput short-read data. However, a highly sensitive search with multiple spaced patterns often requires the use of a great amount of index data. We propose a novel seed-set construction method for ef...


FDiBC: A Novel Fraud Detection Method in Bank Club based on Sliding Time and Scores Window

One of the recent strategies for increasing customer loyalty in the banking industry is the use of a customers' club system. In this system, customers receive scores on the basis of the financial and club activities they perform, and according to the points achieved, they get credits from the bank. In addition, with the advent of new technologies, fraud is growing in the banking domain as well. Therefor...


Dosimetry limitations and pre-treatment dose profile correction for sliding window IMRT

Background: This work investigated the dosimetry limitations of the random and systematic uncertainties of sliding window (SW) intensity modulated radiation therapy (IMRT). Materials and Methods: A Varian 21EX linear accelerator, Pinnacle3 treatment planning system and radiographic film dosimetry were used. The limitations of the SW were studied using beam modulation ranging from 2 to 10...


PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds

MOTIVATION: The explosion of next-generation sequencing data has spawned the design of new algorithms and software tools to provide efficient mapping for different read lengths and sequencing technologies. In particular, ABI's sequencer (SOLiD system) poses a big computational challenge with its capacity to produce very large amounts of data, and its unique strategy of encoding sequence data int...




Publication date: 2012